A Model for User-Oriented Data Provenance in Pipelined Scientific Workflows

Abstract

Integrated provenance support promises to be a chief advantage of scientific workflow systems over script-based alternatives. While it is often recognized that information gathered during scientific workflow execution can be used automatically to increase fault tolerance (via checkpointing) and to optimize performance (by reusing intermediate data products in future runs), it is perhaps more significant that provenance information may also be used by scientists to reproduce results from earlier runs, to explain unexpected results, and to prepare results for publication. Current workflow systems offer little or no direct support for these "scientist-oriented" queries of provenance information. Indeed, the use of advanced execution models in scientific workflows (e.g., process networks, which exhibit pipeline parallelism over streaming data) and failure to record certain fundamental events such as state resets of processes can render existing provenance schemas useless for scientific applications of provenance. We develop a simple provenance model that is capable of supporting a wide range of scientific use cases even for complex models of computation such as process networks. Our approach reduces these use cases to database queries over event logs, and is capable of reconstructing complete data and invocation dependency graphs for a workflow run.
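To make the idea of "database queries over event logs" concrete, the following is a minimal sketch and not the schema or rules of the paper itself. It assumes a hypothetical `events` table with `read`, `write`, and `reset` events per actor, and an illustrative dependency rule: a token written by an actor depends on every token that actor has read since its most recent state reset. The table layout, the event kinds, the sample log, and the function names are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical event log: (seq, actor, kind, token). Not the paper's schema.
SAMPLE_EVENTS = [
    (1, "A", "write", "t1"),
    (2, "B", "read", "t1"),
    (3, "B", "write", "t2"),
    (4, "B", "reset", None),   # B discards its state; later outputs no longer depend on t1
    (5, "A", "write", "t3"),
    (6, "B", "read", "t3"),
    (7, "B", "write", "t4"),
    (8, "C", "read", "t2"),
    (9, "C", "read", "t4"),
    (10, "C", "write", "t5"),
]

# Assumed rule: a written token depends on all tokens the same actor
# read after its latest reset and before the write.
DEPENDENCY_QUERY = """
SELECT w.token AS output, r.token AS input
FROM events AS w
JOIN events AS r
  ON r.actor = w.actor AND r.kind = 'read' AND r.seq < w.seq
WHERE w.kind = 'write'
  AND r.seq > COALESCE(
        (SELECT MAX(s.seq) FROM events AS s
         WHERE s.actor = w.actor AND s.kind = 'reset' AND s.seq < w.seq),
        0)
ORDER BY w.seq, r.seq
"""


def build_log(events):
    """Load events into an in-memory SQLite database and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (seq INTEGER PRIMARY KEY, actor TEXT, kind TEXT, token TEXT)"
    )
    conn.executemany(
        "INSERT INTO events (seq, actor, kind, token) VALUES (?, ?, ?, ?)", events
    )
    return conn


def data_dependencies(conn):
    """Return direct (output, input) token dependencies derived from the log."""
    return conn.execute(DEPENDENCY_QUERY).fetchall()


def lineage(conn, token):
    """Return every token that the given token transitively depends on."""
    deps = {}
    for output, source in data_dependencies(conn):
        deps.setdefault(output, set()).add(source)
    seen, stack = set(), [token]
    while stack:
        for parent in deps.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

Under these assumptions, `lineage(build_log(SAMPLE_EVENTS), "t4")` contains only `t3`, because the recorded reset of actor `B` separates the earlier read of `t1` from the later write of `t4`. If the reset event were missing from the log, `t4` would appear to depend on `t1` as well, which illustrates why the abstract singles out state resets as events that must be recorded.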
